A Conservative Property of a Nested Relational Query Language

Author: Wong Limsoon
Publication venue: ScholarlyCommons
Publication date: 01/07/1992

We proposed in [7] a nested relational calculus and a nested relational algebra based on structural recursion [6,5] and on monads [27,16]. In this report, we describe relative set abstraction as our third nested relational query language. This query language is similar to the well-known list comprehension mechanism in functional programming languages such as Haskell [11], Miranda [24], and KRC [23]. It is equivalent to our earlier query languages in terms of both semantics and equational theories. This strong sense of equivalence allows our three query languages to be freely combined into a nested relational query language that is robust and user-friendly.
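
To give a feel for the comprehension style the abstract refers to, here is a minimal sketch using Python set comprehensions. Python stands in for the report's own language, and the relation `departments` and its contents are hypothetical, not taken from the report.

```python
# Hypothetical nested relation: each department is paired with a set of employees.
departments = {
    ("db", frozenset({"ann", "bob"})),
    ("pl", frozenset({"bob", "cat"})),
}

# Unnest: one (department, employee) pair per member of each inner set.
flat = {(dept, emp) for (dept, emps) in departments for emp in emps}

# Nest again, this time grouping departments under each employee.
by_employee = {
    (emp, frozenset(d for (d, e) in flat if e == emp))
    for (_, emp) in flat
}
```

The same unnest-then-regroup pattern is what a nested relational query typically expresses; the comprehension form keeps both steps declarative.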

ScholarlyCommons@Penn

Controlling False Positives in Association Rule Mining

Authors: Liu Guimei, Wong Limsoon, Zhang Haojun
Publication date: 01/10/2011

Association rule mining is an important problem in data mining. It enumerates and tests a large number of rules on a dataset and outputs the rules that satisfy user-specified constraints. Because so many rules are tested, rules that do not represent a real systematic effect in the data can satisfy the given constraints purely by chance. Hence association rule mining often suffers from a high risk of false positive errors. There is a lack of comprehensive study on controlling false positives in association rule mining. In this paper, we adopt three multiple testing correction approaches (the direct adjustment approach, the permutation-based approach, and the holdout approach) to control false positives in association rule mining, and conduct extensive experiments to study their performance. Our results show that (1) numerous spurious rules are generated if no correction is made, and (2) the three approaches can control false positives effectively. Among the three approaches, the permutation-based approach has the highest power of detecting real association rules, but it is very computationally expensive. We employ several techniques to reduce its cost effectively. Comment: VLDB201
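
As an illustration of what a direct adjustment can look like, the sketch below applies a Bonferroni correction to per-rule p-values. Bonferroni is one common direct adjustment; that the paper uses exactly this variant is an assumption, and the rule names and p-values are hypothetical.

```python
def bonferroni_filter(p_values, alpha=0.05):
    """Keep rules whose p-value passes a Bonferroni-adjusted threshold.

    p_values maps a rule label to its raw p-value (hypothetical inputs).
    The threshold alpha / m bounds the family-wise error rate by alpha,
    where m is the number of rules tested.
    """
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m
    return sorted(rule for rule, p in p_values.items() if p <= threshold)


# Hypothetical rules and p-values, for illustration only.
raw = {"A->B": 0.0001, "C->D": 0.03, "E->F": 0.2}
uncorrected = sorted(rule for rule, p in raw.items() if p <= 0.05)
corrected = bonferroni_filter(raw, alpha=0.05)
```

Without correction two of the three rules pass at the 0.05 level; with the adjusted threshold of 0.05 / 3 only the strongest rule survives.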

arXiv.org e-Print Archive

CiteSeerX

ScholarBank@NUS

A Bounded Degree Property and Finite-Cofiniteness of Graph Queries

Authors: Libkin Leonid, Wong Limsoon
Publication venue: ScholarlyCommons
Publication date: 01/01/1993

We provide new techniques for the analysis of the expressive power of query languages for nested collections. These languages may use set or bag semantics and may be further complicated by the presence of aggregate functions. We exhibit certain classes of graphs and prove that properties of these graphs that can be tested in such languages are either finite or cofinite. This result settles the conjectures of Grumbach, Milo, and Paredaens that parity test, transitive closure, and balanced binary tree test are not expressible in bag languages like BALG of Grumbach and Milo and BQL of Libkin and Wong. Moreover, it implies that many recursive queries, including simple ones like test for a chain, cannot be expressed in a nested relational language even when aggregate functions are available. In an attempt to generalize the finite-cofiniteness result, we study the bounded degree property, which says that the number of distinct in- and out-degrees in the output of a graph query does not depend on the size of the input if the input is simple. We show that such a property implies a number of inexpressibility results in a uniform fashion. We then prove the bounded degree property for the nested relational language.
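
The bounded degree property can be made concrete with a small experiment. The sketch below, with hypothetical helper names, counts the distinct in- and out-degrees of a chain graph and of its transitive closure. It assumes a chain counts as a "simple" input in the abstract's sense. The count for the closure grows with the input size, which is the behaviour the property rules out for expressible queries.

```python
def chain(n):
    """Edges of the chain 0 -> 1 -> ... -> n-1."""
    return {(i, i + 1) for i in range(n - 1)}


def transitive_closure(edges):
    """Naive fixpoint computation of the transitive closure."""
    closure = set(edges)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


def distinct_degree_count(edges):
    """Number of distinct values among all in- and out-degrees."""
    nodes = {v for edge in edges for v in edge}
    out_degrees = {sum(1 for (a, _) in edges if a == v) for v in nodes}
    in_degrees = {sum(1 for (_, b) in edges if b == v) for v in nodes}
    return len(out_degrees | in_degrees)


small, large = chain(5), chain(10)
input_counts = (distinct_degree_count(small), distinct_degree_count(large))
output_counts = (
    distinct_degree_count(transitive_closure(small)),
    distinct_degree_count(transitive_closure(large)),
)
```

The input chains only ever show the degrees 0 and 1, while the closure of an n-node chain shows every degree from 0 to n-1.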

CiteSeerX

ScholarlyCommons@Penn

Relational Foundations For Functorial Data Migration

Authors: Abiteboul Serge, Hsiang Jieh, Jacobs Bart, Wong Limsoon
Publication date: 24/07/2015

We study the data transformation capabilities associated with schemas that are presented by directed multi-graphs and path equations. Unlike most approaches which treat graph-based schemas as abbreviations for relational schemas, we treat graph-based schemas as categories. A schema $S$ is a finitely-presented category, and the collection of all $S$-instances forms a category, $S$-inst. A functor $F$ between schemas $S$ and $T$, which can be generated from a visual mapping between graphs, induces three adjoint data migration functors, $\Sigma_F: S$-inst $\to T$-inst, $\Pi_F: S$-inst $\to T$-inst, and $\Delta_F: T$-inst $\to S$-inst. We present an algebraic query language FQL based on these functors, prove that FQL is closed under composition, prove that FQL can be implemented with the select-project-product-union relational algebra (SPCU) extended with a key-generation operation, and prove that SPCU can be implemented with FQL.
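
Of the three functors, $\Delta_F$ is the simplest to sketch: it sends a $T$-instance $I$ to the composite $I \circ F$. The Python below is a rough illustration under assumed encodings, not the paper's FQL. Schemas are plain dictionaries, $F$ maps each $S$-arrow to a path of $T$-arrows, path equations are ignored, and all names and data are hypothetical.

```python
def delta(s_arrows, f_obj, f_arr, tables_t, columns_t):
    """Pull a T-instance back along F to obtain an S-instance.

    s_arrows:  S-arrow name -> (source object, target object)
    f_obj:     S-object -> T-object
    f_arr:     S-arrow -> list of T-arrows forming a path
    tables_t:  T-object -> set of row ids
    columns_t: T-arrow -> dict from source row id to target row id
    """
    # Each S-table is a copy of the T-table that F points it at.
    tables_s = {s: set(tables_t[t]) for s, t in f_obj.items()}
    columns_s = {}
    for arrow, (src, _dst) in s_arrows.items():
        column = {}
        for row in tables_s[src]:
            value = row
            # Follow the T-path that F assigns to this S-arrow.
            for t_arrow in f_arr[arrow]:
                value = columns_t[t_arrow][value]
            column[row] = value
        columns_s[arrow] = column
    return tables_s, columns_s


# Hypothetical T-instance: Person -worksIn-> Dept -partOf-> Org.
tables_t = {"Person": {"p1", "p2"}, "Dept": {"d1"}, "Org": {"o1"}}
columns_t = {"worksIn": {"p1": "d1", "p2": "d1"}, "partOf": {"d1": "o1"}}

# Hypothetical S: Emp -org-> Org, with F sending org to the two-step path.
tables_s, columns_s = delta(
    s_arrows={"org": ("Emp", "Org")},
    f_obj={"Emp": "Person", "Org": "Org"},
    f_arr={"org": ["worksIn", "partOf"]},
    tables_t=tables_t,
    columns_t=columns_t,
)
```

Here the single $S$-arrow `org` is computed by composing two $T$-columns, which is the sense in which $\Delta_F$ behaves like a projection or renaming of the target data.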

arXiv.org e-Print Archive

Crossref

Comparative analysis and assessment of M. tuberculosis H37Rv protein-protein interaction datasets

Authors: Wong Limsoon, Zhou Hufeng
Publication venue: BioMed Central
Publication date: 01/01/2011

DOI: 10.1186/1471-2164-12-S3-S20. 10th Int. Conference on Bioinformatics - 1st ISCB Asia Joint Conference 2011, InCoB 2011/ISCB-Asia 2011: Computational Biology - Proceedings from Asia Pacific Bioinformatics Network (APBioNet), 12, SUPPL.

Crossref

Springer - Publisher Connector

PubMed Central

ScholarBank@NUS

Methods for protein complex prediction and their contributions towards understanding the organization, function and dynamics of complexes

Authors: Patil Ashwini, Srihari Sriganesh, Wong Limsoon, Yong Chern Han
Publication venue: Elsevier BV
Publication date: 01/01/2015

Complexes of physically interacting proteins constitute fundamental functional units responsible for driving biological processes within cells. A faithful reconstruction of the entire set of complexes is therefore essential to understand the functional organization of cells. In this review, we discuss the key contributions of computational methods developed to date (approximately between 2003 and 2015) for identifying complexes from the network of interacting proteins (PPI network). We evaluate in depth the performance of these methods on PPI datasets from yeast, and highlight the challenges they face, in particular detecting sparse, small, or sub-complexes and discerning overlapping complexes. We describe methods for integrating diverse information, including expression profiles and 3D structures of proteins, with PPI networks to understand the dynamics of complex formation, for instance the time-based assembly of complex subunits and the formation of fuzzy complexes from intrinsically disordered proteins. Finally, we discuss methods for identifying dysfunctional complexes in human diseases, an application that is proving invaluable for understanding disease mechanisms and discovering novel therapeutic targets. We hope this review aptly commemorates a decade of research on computational prediction of complexes and constitutes a valuable reference for further advancements in this exciting area. Comment: 1 Tabl

arXiv.org e-Print Archive

Elsevier - Publisher Connector

University of Queensland eSpace

A Performance Study of Three Disk-based Structures for Indexing and Querying Frequent Itemsets

Authors: Liu Guimei, Suchitra Andre, Wong Limsoon
Publication venue: VLDB Endowment
Publication date: 01/05/2013

Proceedings of the VLDB Endowment 6(7), 505-51

CiteSeerX

ScholarBank@NUS

CAMBer: an approach to support comparative analysis of multiple bacterial strains

Authors: Tiuryn Jerzy, Wong Limsoon, Wozniak Michal
Publication venue: BioMed Central
Publication date: 01/01/2010

DOI: 10.1109/BIBM.2010.5706549. Proceedings - 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010, 121-12

Crossref

Springer - Publisher Connector

PubMed Central

ScholarBank@NUS

A Flexible Approach to Finding Representative Pattern Sets

Authors: Liu Guimei, Wong Limsoon, Zhang Haojun
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2013

DOI: 10.1109/TKDE.2013.27. IEEE Transactions on Knowledge and Data Engineering

ScholarBank@NUS